α-OCC: Uncertainty-Aware Camera-based 3D Semantic Occupancy Prediction

Transactions on Machine Learning Research

1 Department of Computer Science and Engineering, University of Connecticut
2 Tandon School of Engineering, New York University
3 Adjunct Professor, New York University



(a) As the percentage of depth uncertainty increases, the accuracy (mIoU) of OCC decreases significantly. (b) OCC datasets are highly class-imbalanced. The percentage next to each class is its proportion in the SemanticKITTI dataset. Because the safety-critical class "bicyclist" accounts for only 0.01%, the trained OCC model fails to detect the bicyclist ahead, leading to a crash. After quantifying the uncertainty and post-processing with our HCP, the crash is avoided, because HCP improves the occupied recall of rare classes. Applying Depth-UP and HCP together further enhances safety, as the bicyclist is identified more accurately; with HCP alone, many bicyclist voxels are assigned the highest probability for the car class (blue). Due to visualization limits, each occupied voxel shows the most probable nonempty class in its HCP prediction set.

Abstract

Comprehending 3D scenes is essential for tasks such as planning and mapping in autonomous vehicles and robotics. Camera-based 3D Semantic Occupancy Prediction (OCC) aims to infer scene geometry and semantics from limited observations. Although it has gained popularity thanks to its affordability and rich visual cues, existing methods often neglect the inherent uncertainty in models. To address this, we propose an uncertainty-aware OCC method (α-OCC). We first introduce Depth-UP, an uncertainty propagation framework that improves geometry completion by up to 11.58% and semantic segmentation by up to 12.95% across various OCC models. For uncertainty quantification (UQ), we propose the hierarchical conformal prediction (HCP) method, which effectively handles the high level of class imbalance in OCC datasets. At the geometry level, its novel KL-based score function significantly improves the occupied recall of safety-critical classes (a 45% improvement) with minimal performance overhead (a 3.4% reduction). For UQ, HCP achieves smaller prediction set sizes while maintaining the defined coverage guarantee: compared with baselines, it reduces set size by up to 90%, with a further 18% reduction when integrated with Depth-UP. Our contributions advance OCC accuracy and robustness, marking a noteworthy step forward in autonomous perception systems.

Contributions

  • To the best of our knowledge, we are the first to propose an uncertainty propagation framework, Depth-UP, to improve OCC performance. It uses depth uncertainty, quantified through direct modeling, to improve both geometry completion and semantic segmentation, yielding substantial performance gains across common OCC models.
  • The high level of class imbalance in OCC leads to biased predictions and low recall for rare classes. To address it, we propose HCP. For geometry completion, a novel KL-based score function improves the occupied recall of safety-critical classes with little performance overhead. For UQ, HCP achieves a smaller prediction set size under the defined class coverage guarantee.
  • Our α-OCC shows that uncertainty is an integral and vital part of OCC tasks. Integrating Depth-UP (which propagates depth uncertainty to OCC) and HCP (which quantifies OCC uncertainty) improves both the accuracy and the uncertainty quantification of OCC models.

Method



    Overview of our α-OCC method. Non-black colors highlight the novel components and key techniques of our method. C denotes the concatenation of the depth feature F_D and the image feature F_I. In the Depth-UP part, we estimate the uncertainty of depth estimation through direct modeling: when retraining the depth model, we train only the additional standard deviation head and keep the rest of the model frozen. We then propagate this uncertainty in two ways: through depth feature extraction (for semantic segmentation), and by building a probabilistic voxel grid map M_p via probabilistic geometry projection (for geometry completion). Each element of M_p is the occupied probability of the corresponding voxel, computed from the depth distributions of all rays passing through that voxel.
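The probabilistic geometry projection described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the depth along each ray is Gaussian (mean from the depth network, standard deviation from the extra head), and it fuses several rays crossing one voxel with a noisy-OR rule under an independence assumption. The function names and the fusion rule are hypothetical and chosen only for illustration.

```python
import math


def normal_cdf(x, mean, std):
    """CDF of a Gaussian N(mean, std^2) evaluated at x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def ray_voxel_probabilities(depth_mean, depth_std, voxel_edges):
    """Probability that the surface seen along one ray lies in each voxel.

    voxel_edges holds the increasing distances at which the ray crosses
    voxel boundaries, so consecutive pairs delimit one voxel each.
    Assumption: the depth along the ray is Gaussian.
    """
    return [
        normal_cdf(far, depth_mean, depth_std) - normal_cdf(near, depth_mean, depth_std)
        for near, far in zip(voxel_edges[:-1], voxel_edges[1:])
    ]


def combine_rays(per_ray_probs):
    """Fuse the probabilities of all rays crossing one voxel.

    Assumption: rays are independent, so the voxel is empty only if
    every ray reports it empty (noisy-OR). The source does not state
    which fusion rule is used.
    """
    p_empty = 1.0
    for p in per_ray_probs:
        p_empty *= 1.0 - p
    return 1.0 - p_empty


# One ray with estimated depth 10.0 m and standard deviation 0.5 m,
# crossing four voxels that are 0.5 m deep along the ray.
probs = ray_voxel_probabilities(10.0, 0.5, [9.0, 9.5, 10.0, 10.5, 11.0])
```

With a small standard deviation almost all probability mass falls into the voxel containing the estimated depth, which recovers a deterministic projection; a larger standard deviation spreads the mass over neighbouring voxels along the ray.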



    Overview of our HCP method. We first predict each voxel's occupied state by applying a quantile to the KL-based score, which improves the occupied recall of rare classes, and then generate prediction sets only for the voxels predicted as occupied. The occupied quantile q^o_y and the semantic quantile q^s_y are computed during the calibration step of HCP.
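The two-stage structure of HCP can be sketched as follows. This is a simplified illustration, not the authors' code. It assumes the common conformal score 1 - p(true class) for the semantic stage, uses a single occupancy threshold `q_occ` rather than per-class occupied quantiles, and takes the occupancy score `occ_score` as an input because the KL-based score is not defined on this page. All function names are hypothetical.

```python
import math


def conformal_quantile(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of calibration scores."""
    n = len(scores)
    rank = math.ceil((n + 1) * (1.0 - alpha))  # 1-based rank in sorted order
    if n == 0 or rank > n:
        return math.inf  # too few samples: never exclude this class
    return sorted(scores)[rank - 1]


def calibrate_semantic(probs, labels, alpha):
    """Per-class quantiles of the score 1 - p(true class).

    probs is a list of {class: probability} dicts for calibration voxels,
    labels the matching true classes. The score is an assumed choice.
    """
    by_class = {}
    for p, y in zip(probs, labels):
        by_class.setdefault(y, []).append(1.0 - p[y])
    return {y: conformal_quantile(s, alpha) for y, s in by_class.items()}


def hierarchical_predict(p, occ_score, q_occ, sem_quantiles):
    """Stage 1 gates on occupancy; stage 2 builds a set for occupied voxels.

    Assumption: a lower occ_score means "more likely occupied", so the
    voxel counts as occupied when occ_score <= q_occ.
    """
    if occ_score > q_occ:
        return []  # treated as empty: no prediction set is generated
    return sorted(y for y, q in sem_quantiles.items() if 1.0 - p[y] <= q)
```

Calibrating one quantile per class, instead of one shared quantile, is what lets a rare class keep its own coverage level even when frequent classes dominate the calibration set.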


Qualitative Results

     


    Qualitative results of the base VoxFormer model and of VoxFormer with our Depth-UP.

BibTeX

    @article{Su2026alpha_occ,
      author  = {Su, Sanbao and Chen, Nuo and Lin, Chenchen and Juefei-Xu, Felix and Feng, Chen and Miao, Fei},
      title   = {$\alpha$-OCC: Uncertainty-Aware Camera-based 3D Semantic Occupancy Prediction},
      year    = {2026},
      journal = {Transactions on Machine Learning Research}
    }