Research Article

Incremental Instance-Oriented 3D Semantic Mapping via RGB-D Cameras for Unknown Indoor Scene

Table 2

Comparison with the 3D semantic instance segmentation approach of Voxblox++ [16], proposed by Grinvald et al. For 10 sequences from the SceneNN dataset [21], the per-class average precision (AP) is computed over the predicted 3D segmentation masks using an intersection over union (IoU) threshold of 0.5.

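Seq. ID | Method | Bed | Chair | Sofa | Table | Books | Refrigerator | TV | Toilet | Bag

011 | Voxblox++ | 75, 50, 100
011 | Ours | 68.7, 67, 100

016 | Voxblox++ | 100, 0.0, 0.0
016 | Ours | 75, 0.0, 0.0

030 | Voxblox++ | 54.4, 100, 55.6, 14.3
030 | Ours | 76, 100, 50, 8.3

061 | Voxblox++ | 100, 33.3
061 | Ours | 59.9, 33.3

078 | Voxblox++ | 33.3, 0.0, 47.6, 100
078 | Ours | 50, 100, 54.2, 75

086 | Voxblox++ | 80, 0.0, 0.0
086 | Ours | 66.7, 25, 50

096 | Voxblox++ | 0.0, 87.5, 37.5, 0.0, 0.0, 50
096 | Ours | 0.0, 55.7, 39.5, 11.1, 0.0, 68.7

206 | Voxblox++ | 58.3, 100, 60, 100
206 | Ours | 60, 100, 55, 100

223 | Voxblox++ | 12.5, 75
223 | Ours | 16.7, 75

255 | Voxblox++ | 75
255 | Ours | 75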
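
The per-class AP at an IoU threshold of 0.5 described in the caption of Table 2 can be illustrated with a minimal sketch. This is not the evaluation code behind the table. It assumes that each 3D mask is a set of voxel (or vertex) indices, that every prediction carries a confidence score, and that AP is the non-interpolated form (mean precision at each true positive, normalized by the number of ground-truth instances). The names `iou_3d` and `average_precision` are hypothetical, and the source does not state the exact protocol.

```python
from typing import List, Set, Tuple


def iou_3d(pred: Set[int], gt: Set[int]) -> float:
    """IoU of two 3D masks given as sets of voxel (or vertex) indices."""
    union = len(pred | gt)
    return len(pred & gt) / union if union else 0.0


def average_precision(
    preds: List[Tuple[float, Set[int]]],
    gts: List[Set[int]],
    iou_thresh: float = 0.5,
) -> float:
    """Non-interpolated AP for one class in one sequence.

    preds: (confidence, mask) pairs; the confidence score is an assumption.
    gts:   ground-truth instance masks of the same class.
    """
    if not gts:
        return 0.0  # assumed convention when the class is absent

    matched = [False] * len(gts)
    hits = []
    # Visit predictions from most to least confident.
    for _, mask in sorted(preds, key=lambda p: p[0], reverse=True):
        best_iou, best_j = 0.0, -1
        for j, gt in enumerate(gts):
            if matched[j]:
                continue  # each ground-truth instance is matched at most once
            iou = iou_3d(mask, gt)
            if iou > best_iou:
                best_iou, best_j = iou, j
        hit = best_j >= 0 and best_iou >= iou_thresh
        if hit:
            matched[best_j] = True
        hits.append(hit)

    # Sum the precision at every rank where a true positive occurs.
    ap, tp = 0.0, 0
    for rank, hit in enumerate(hits, start=1):
        if hit:
            tp += 1
            ap += tp / rank
    return ap / len(gts)


# Toy example: two ground-truth instances, three predictions.
gts = [{0, 1, 2, 3}, {10, 11, 12, 13}]
preds = [
    (0.9, {0, 1, 2}),               # IoU 0.75 with the first instance
    (0.8, {20, 21}),                # overlaps nothing: false positive
    (0.7, {10, 11, 12, 13, 14}),    # IoU 0.80 with the second instance
]
ap = average_precision(preds, gts)
```

In this toy example the ranked predictions are a hit, a miss, and a hit, so the AP is (1/1 + 2/3) / 2, roughly 0.833; multiplying by 100 gives a percentage on the same scale as the table entries.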