Improve model card with pipeline tag and library
#1
by nielsr (HF Staff) - opened

README.md CHANGED

---
license: mit
pipeline_tag: image-to-3d
library_name: pytorch
---

# Plane-DUSt3R: Unposed Sparse Views Room Layout Reconstruction in the Age of Pretrain Model

This model, presented in the paper [Unposed Sparse Views Room Layout Reconstruction in the Age of Pretrain Model](https://hf.co/papers/2502.16779), performs multi-view room layout reconstruction from unposed sparse views. It leverages the DUSt3R framework and is fine-tuned on the Structured3D dataset to estimate structural planes, offering a streamlined, end-to-end solution.

This repository contains the official implementation of the paper "Unposed Sparse Views Room Layout Reconstruction in the Age of Pretrain Model", accepted at ICLR 2025.

[[arXiv]](https://arxiv.org/abs/2502.16779)

## Overview

Plane-DUSt3R is a novel pipeline for multi-view room layout reconstruction from unposed sparse views. It combines single-view plane detection with a multi-view 3D reconstruction method to achieve robust and accurate plane detection in indoor scenes.

## Get Started

### Installation

Create the environment; here we show an example using conda.

```
conda create -n planedust3r python=3.11 cmake=3.14.0
conda activate planedust3r
conda install pytorch==2.2.0 torchvision==0.17.0 torchaudio==2.2.0 pytorch-cuda=11.8 -c pytorch -c nvidia  # use the CUDA version that matches your system; tested with PyTorch 2.2.0
cd MASt3R
pip install -r requirements.txt
pip install -r dust3r/requirements.txt
```

Optional: compile the CUDA kernels for RoPE (as in CroCo v2).

```
# DUSt3R relies on RoPE positional embeddings, for which you can compile CUDA kernels for a faster runtime.
cd dust3r/croco/models/curope/
python setup.py build_ext --inplace
cd ../../../../
```

Finally, return to the repository root and install its requirements:

```
cd ..
pip install -r requirements.txt
```
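
As a quick, optional sanity check (not part of the official setup), you can confirm that the environment sees PyTorch and the GPU before downloading checkpoints:

```
# Optional sanity check for the environment created above (not part of the official setup).
import torch
import torchvision

print("PyTorch:", torch.__version__)            # expected: 2.2.0
print("torchvision:", torchvision.__version__)  # expected: 0.17.0
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
```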
| 51 | 
            +
             | 
| 52 | 
            +
            ### Checkpoints 
         | 
| 53 | 
            +
            ```
         | 
| 54 | 
            +
            mkdir -p checkpoints/
         | 
| 55 | 
            +
            ```
         | 
| 56 | 
            +
            And download the plane-dust3r checkpoint from the following google drive link:
         | 
| 57 | 
            +
            [plane-dust3r](https://drive.google.com/file/d/1sQ-IpRhfrPt4b1ZXhuPg2_dG1fnzo2SE/view?usp=sharing)
         | 
| 58 | 
            +
             | 
| 59 | 
            +
            The plane-dust3r checkpoint is also available on huggingface [huggingface](https://huggingface.co/yxuan/Plane-DUSt3R)
         | 
| 60 | 
            +
             | 
| 61 | 
            +
            And download the noncuboid checkpoints from the following google drive link:
         | 
| 62 | 
            +
            [noncuboid](https://drive.google.com/file/d/1DZnnOUMh6llVwhBvb-yo9ENVmN4o42x8/view?usp=sharing)
         | 
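
If you prefer to pull the weights from the Hugging Face Hub instead of Google Drive, a sketch along these lines should work with `huggingface_hub`; the filename is an assumption, so check the file list of [yxuan/Plane-DUSt3R](https://huggingface.co/yxuan/Plane-DUSt3R) first.

```
# Sketch: fetch the Plane-DUSt3R weights from the Hugging Face Hub.
# The filename below is an assumption -- verify it against the files listed in the
# yxuan/Plane-DUSt3R repository before running.
from huggingface_hub import hf_hub_download

ckpt_path = hf_hub_download(
    repo_id="yxuan/Plane-DUSt3R",
    filename="checkpoint-best-onlyencoder.pth",  # assumed name, matching the local path used below
    local_dir="checkpoints",
)
print("Checkpoint saved to:", ckpt_path)
```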

## Usage

### Interactive Demo

```
python3 MASt3R/dust3r/demo.py --weights checkpoints/checkpoint-best-onlyencoder.pth
# Use --weights to load a checkpoint from a local file
```
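
Since Plane-DUSt3R is built on the DUSt3R code bundled under `MASt3R/dust3r`, the checkpoint can in principle also be driven programmatically through DUSt3R's standard Python API. The snippet below is only a rough sketch under that assumption; the model class, image paths, and settings are illustrative, and the scripts in this README remain the supported entry points.

```
# Rough sketch (assumption): load the checkpoint through the DUSt3R API bundled in MASt3R/dust3r.
# Paths and settings are illustrative, not an official interface of this repository.
import torch
from dust3r.model import AsymmetricCroCo3DStereo
from dust3r.utils.image import load_images
from dust3r.image_pairs import make_pairs
from dust3r.inference import inference

device = "cuda" if torch.cuda.is_available() else "cpu"
model = AsymmetricCroCo3DStereo.from_pretrained("checkpoints/checkpoint-best-onlyencoder.pth").to(device)

# Two unposed views of the same room (illustrative file names).
images = load_images(["view1.png", "view2.png"], size=512)
pairs = make_pairs(images, scene_graph="complete", prefilter=None, symmetrize=True)
output = inference(pairs, model, device, batch_size=1)
print(output["pred1"]["pts3d"].shape)  # per-pixel 3D points predicted for the first view of each pair
```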

## Training

Please see the `train` branch.

## Evaluation

### Data preparation

Please download the Structured3D dataset from [here](https://structured3d-dataset.org/).

The directory should be organized as follows:
```
root_path
├── scene_id_1
│   └── 2D_rendering
│       └── room_id_1
│           └── perspective
│               └── full
│                   ├── position_id_1
│                   │   └── rgb_rawlight.png
│                   ├── position_id_2
│                   │   └── rgb_rawlight.png
│                   └── ...
└── scene_id_2
    └── 2D_rendering
        └── room_id_2
            └── perspective
                └── full
                    ├── position_id_1
                    │   └── rgb_rawlight.png
                    ├── position_id_2
                    │   └── rgb_rawlight.png
                    └── ...
```
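
To double-check that a local copy matches this layout, a short helper like the following can list the perspective views (purely illustrative; it is not part of the repository):

```
# Illustrative helper (not part of the repository): list rgb_rawlight.png views
# following the Structured3D layout shown above.
import glob
import os

root_path = "/path/to/Structured3D/dataset"
pattern = os.path.join(root_path, "*", "2D_rendering", "*", "perspective", "full", "*", "rgb_rawlight.png")
views = sorted(glob.glob(pattern))
print(f"Found {len(views)} perspective views")
for view in views[:5]:
    print(view)
```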

Since we use the plane depth to evaluate performance, the plane layout needs to be converted to a plane depth map:
```
python convert_plane_depth.py --path /path/to/Structured3D/dataset
```

To evaluate on the test set, run:
```
python evaluate_planedust3r.py \
    --dust3r_model checkpoints/checkpoint-best-onlyencoder.pth \
    --noncuboid_model checkpoints/Structured3D_pretrained.pt \
    --root_path /path/to/Structured3D/dataset \
    --save_path /path/to/save/result \
    --device cuda
```
The evaluation creates a folder at `$save_path$` and saves the results under `$save_path$/scene_number/room_id/`. If you don't want to save the results, set `--save_flag False`.
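
Once evaluation finishes, the per-room result folders can be listed with a small script like this (illustrative only; which files each folder contains depends on the evaluation script):

```
# Illustrative only: list the per-room result folders written by the evaluation
# under <save_path>/scene_number/room_id/ as described above.
import os

save_path = "/path/to/save/result"
for scene in sorted(os.listdir(save_path)):
    scene_dir = os.path.join(save_path, scene)
    if not os.path.isdir(scene_dir):
        continue
    for room in sorted(os.listdir(scene_dir)):
        print(os.path.join(scene_dir, room))
```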

## Citation

If you find this work useful in your research, please consider citing:
```
@misc{huang2025unposedsparseviewsroom,
      title={Unposed Sparse Views Room Layout Reconstruction in the Age of Pretrain Model}, 