TSP3D: Text-guided Sparse Voxel Pruning for Efficient 3D Visual Grounding

This repo contains the models for the paper "Text-guided Sparse Voxel Pruning for Efficient 3D Visual Grounding". Code is available at: https://github.com/GWxuan/TSP3D

Text-guided Sparse Voxel Pruning for Efficient 3D Visual Grounding
Wenxuan Guo*, Xiuwei Xu*, Ziwei Wang, Jianjiang Feng†, Jie Zhou, Jiwen Lu

* Equal contribution † Corresponding author

In this work, we propose an efficient multi-level convolution architecture for 3D visual grounding. TSP3D achieves superior performance compared to previous approaches in both inference speed and accuracy.

Main Results

  • We provide the checkpoints for quick reproduction of the results reported in the paper.

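    | Benchmark | Pipeline     | Acc@0.25 | Acc@0.5 | Inference Speed (FPS) | Downloads |
    |-----------|--------------|----------|---------|-----------------------|-----------|
    | ScanRefer | Single-stage | 56.45    | 46.71   | 12.43                 | model     |

    | Benchmark | Pipeline     | Acc@0.25 | Acc@0.5 | Downloads |
    |-----------|--------------|----------|---------|-----------|
    | Nr3d      | Single-stage | 48.7     | 37.0    | model     |
    | Sr3d      | Single-stage | 57.1     | 44.1    | model     |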
  • Comparison of 3D visual grounding (3DVG) methods on the ScanRefer dataset:

    Figure 2
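
For readers unfamiliar with the Acc@0.25 and Acc@0.5 columns, the sketch below illustrates how such a metric is commonly computed: the percentage of referring expressions whose predicted box overlaps the ground-truth box with a 3D IoU of at least the threshold. This is not taken from the TSP3D codebase. The axis-aligned `(xmin, ymin, zmin, xmax, ymax, zmax)` box format, the `>=` comparison, and the function names `box_iou_3d` and `acc_at_iou` are assumptions made for illustration only.

```python
from typing import Sequence, Tuple

# Assumed box format (hypothetical, for illustration):
# axis-aligned (xmin, ymin, zmin, xmax, ymax, zmax).
Box = Tuple[float, float, float, float, float, float]


def box_volume(box: Box) -> float:
    """Volume of an axis-aligned box; 0 if any side is degenerate."""
    dx = max(0.0, box[3] - box[0])
    dy = max(0.0, box[4] - box[1])
    dz = max(0.0, box[5] - box[2])
    return dx * dy * dz


def box_iou_3d(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned 3D boxes."""
    inter = 1.0
    for axis in range(3):
        lo = max(a[axis], b[axis])          # start of the overlap on this axis
        hi = min(a[axis + 3], b[axis + 3])  # end of the overlap on this axis
        if hi <= lo:
            return 0.0                      # no overlap on this axis
        inter *= hi - lo
    union = box_volume(a) + box_volume(b) - inter
    return inter / union if union > 0 else 0.0


def acc_at_iou(preds: Sequence[Box], gts: Sequence[Box], threshold: float) -> float:
    """Percentage of predictions whose IoU with the ground truth is >= threshold."""
    if len(preds) != len(gts):
        raise ValueError("preds and gts must have the same length")
    if not preds:
        return 0.0
    hits = sum(box_iou_3d(p, g) >= threshold for p, g in zip(preds, gts))
    return 100.0 * hits / len(preds)


# Toy data: one prediction per referring expression.
preds = [(0, 0, 0, 2, 2, 2), (0, 0, 0, 1, 1, 1), (0, 0, 0, 1, 1, 1)]
gts = [(0, 0, 0, 2, 2, 1), (0, 0, 0, 1, 1, 3), (5, 5, 5, 6, 6, 6)]

acc_025 = acc_at_iou(preds, gts, 0.25)  # IoUs are 0.5, 1/3, 0 -> two of three hit
acc_05 = acc_at_iou(preds, gts, 0.5)    # only the first pair reaches 0.5
print(f"Acc@0.25 = {acc_025:.2f}, Acc@0.5 = {acc_05:.2f}")
```

A stricter threshold can only lower the score, which is why Acc@0.5 is never higher than Acc@0.25 in the tables above.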
