nielsr (HF Staff) committed
Commit 3b9dfbd · verified
1 Parent(s): 058f800

Improve model card with pipeline tag and library


This PR improves the model card by:

- Adding the `pipeline_tag: image-to-3d` to better categorize the model.
- Specifying the `library_name: pytorch` to indicate the framework used.

Files changed (1)
  1. README.md +117 -4
README.md CHANGED
@@ -1,17 +1,130 @@
 ---
 license: mit
 ---
- # Unposed Sparse Views Room Layout Reconstruction in the Age of Pretrain Model
 
 ## Usage
 
- For detailed usage instructions and documentation of this model, please refer to our GitHub repository:
 
- [GitHub Repository](https://github.com/justacar/Plane-DUSt3R)
 
 
- ## Citation
 
 ```
 @misc{huang2025unposedsparseviewsroom,
 title={Unposed Sparse Views Room Layout Reconstruction in the Age of Pretrain Model},
 
 ---
 license: mit
+ pipeline_tag: image-to-3d
+ library_name: pytorch
 ---
+
+ # Plane-DUSt3R: Unposed Sparse Views Room Layout Reconstruction in the Age of Pretrain Model
+
+ This model, presented in the paper [Unposed Sparse Views Room Layout Reconstruction in the Age of Pretrain Model](https://hf.co/papers/2502.16779), performs multi-view room layout reconstruction from unposed sparse views. It leverages the DUSt3R framework and is fine-tuned on the Structured3D dataset to estimate structural planes, offering a streamlined, end-to-end solution.
+
+ ![Overview](assets/teaser.png)
+ This repository contains the official implementation of the paper "Unposed Sparse Views Room Layout Reconstruction in the Age of Pretrain Model", accepted at ICLR 2025.
+ [[arXiv]](https://arxiv.org/abs/2502.16779)
+
+ ## Overview
+
+ Plane-DUSt3R is a novel pipeline for multi-view room layout reconstruction from unposed sparse views.
+
+ It combines single-view plane detection with a multi-view 3D reconstruction method to achieve robust and accurate plane detection in indoor scenes.
+ ![Overview](assets/architecture.png)
+
+ ## Get Started
+
+ ### Installation
+
+ Create the environment; here we show an example using conda.
+
+ ```
+ conda create -n planedust3r python=3.11 cmake=3.14.0
+ conda activate planedust3r
+ conda install pytorch==2.2.0 torchvision==0.17.0 torchaudio==2.2.0 pytorch-cuda=11.8 -c pytorch -c nvidia  # use the CUDA version matching your system; tested with PyTorch 2.2.0
+ cd MASt3R
+ pip install -r requirements.txt
+ pip install -r dust3r/requirements.txt
+ ```
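After installation, a quick sanity check can confirm that PyTorch is importable and sees a CUDA device. This is our own hypothetical helper, not part of the repository:

```python
import importlib.util

def cuda_stack_ready() -> bool:
    """Return True only if torch imports and reports an available CUDA device."""
    # Probe for torch first so the check degrades gracefully when it is absent.
    if importlib.util.find_spec("torch") is None:
        return False
    import torch
    return torch.cuda.is_available()

print("CUDA ready:", cuda_stack_ready())
```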
+
+ Optionally, compile the CUDA kernels for RoPE (as in CroCo v2):
+
+ ```
+ # DUSt3R relies on RoPE positional embeddings, for which you can compile CUDA kernels for faster runtime.
+ cd dust3r/croco/models/curope/
+ python setup.py build_ext --inplace
+ cd ../../../../
+ ```
+
+ Then return to the repository root and install the remaining requirements:
+
+ ```
+ cd ..
+ pip install -r requirements.txt
+ ```
+
+ ### Checkpoints
+
+ ```
+ mkdir -p checkpoints/
+ ```
+
+ Then download the Plane-DUSt3R checkpoint from the following Google Drive link:
+ [plane-dust3r](https://drive.google.com/file/d/1sQ-IpRhfrPt4b1ZXhuPg2_dG1fnzo2SE/view?usp=sharing)
+
+ The Plane-DUSt3R checkpoint is also available on [Hugging Face](https://huggingface.co/yxuan/Plane-DUSt3R).
+
+ Also download the noncuboid checkpoint from the following Google Drive link:
+ [noncuboid](https://drive.google.com/file/d/1DZnnOUMh6llVwhBvb-yo9ENVmN4o42x8/view?usp=sharing)
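Before running the demo or evaluation, a small check can confirm the checkpoints are where later commands expect them. This is our own sketch; the filenames are the ones used by the demo and evaluation commands in this README:

```python
from pathlib import Path

# Checkpoint paths as referenced by the demo and evaluation commands below.
EXPECTED = [
    "checkpoints/checkpoint-best-onlyencoder.pth",  # Plane-DUSt3R weights
    "checkpoints/Structured3D_pretrained.pt",       # noncuboid weights
]

def missing_checkpoints(root="."):
    """Return the expected checkpoint paths that are not present under root."""
    base = Path(root)
    return [rel for rel in EXPECTED if not (base / rel).is_file()]

print("Missing:", missing_checkpoints())
```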
 
 ## Usage
 
+ ### Interactive Demo
 
+ ```
+ python3 MASt3R/dust3r/demo.py --weights checkpoints/checkpoint-best-onlyencoder.pth
+ # Use --weights to load a checkpoint from a local file
+ ```
+ ![Demo interface](assets/demo.jpg)
 
 
+ ## Training
+
+ Please see the `train` branch.
+
+ ## Evaluation
+
+ ### Data preparation
+
+ Please download the Structured3D dataset from [here](https://structured3d-dataset.org/).
+
+ The directory should look like this:
+ ```
+ root_path
+ ├── scene_id_1
+ │   └── 2D_rendering
+ │       └── room_id_1
+ │           └── perspective
+ │               └── full
+ │                   ├── position_id_1
+ │                   │   └── rgb_rawlight.png
+ │                   ├── position_id_2
+ │                   │   └── rgb_rawlight.png
+ │                   └── ...
+ └── scene_id_2
+     └── 2D_rendering
+         └── room_id_2
+             └── perspective
+                 └── full
+                     ├── position_id_1
+                     │   └── rgb_rawlight.png
+                     ├── position_id_2
+                     │   └── rgb_rawlight.png
+                     └── ...
+ ```
+ Since we use plane depth to evaluate performance, the plane layout must first be converted to plane depth maps:
+ ```
+ python convert_plane_depth.py --path /path/to/Structured3D/dataset
+ ```
+
+ To evaluate on the test set, run:
+ ```
+ python evaluate_planedust3r.py \
+     --dust3r_model checkpoints/checkpoint-best-onlyencoder.pth \
+     --noncuboid_model checkpoints/Structured3D_pretrained.pt \
+     --root_path /path/to/Structured3D/dataset \
+     --save_path /path/to/save/result \
+     --device cuda
+ ```
+ The evaluation creates a folder at `save_path` and writes results to `save_path/scene_number/room_id/`. If you don't want to save the results, set `--save_flag False`.
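The saved results can then be enumerated with a few lines. This is a hypothetical helper of ours, assuming only the `save_path/scene_number/room_id/` layout described above:

```python
from pathlib import Path

def completed_rooms(save_path):
    """List (scene, room) pairs that have a result directory under save_path."""
    rooms = []
    for scene_dir in sorted(Path(save_path).iterdir()):
        if scene_dir.is_dir():
            for room_dir in sorted(scene_dir.iterdir()):
                if room_dir.is_dir():
                    rooms.append((scene_dir.name, room_dir.name))
    return rooms
```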
+
+ ## Citation
+
+ If you find this work useful in your research, please consider citing:
 ```
 @misc{huang2025unposedsparseviewsroom,
 title={Unposed Sparse Views Room Layout Reconstruction in the Age of Pretrain Model},