lch01 committed on
Commit
508da3f
·
verified ·
1 Parent(s): 423e22e

Update README.md

  1. README.md +36 -4
README.md CHANGED
@@ -5,7 +5,39 @@ tags:
5
  pipeline_tag: image-to-3d
6
  ---
7
 
8
- This model has been pushed to the Hub using the [PytorchModelHubMixin](https://huggingface.co/docs/huggingface_hub/package_reference/mixins#huggingface_hub.PyTorchModelHubMixin) integration:
9
- - Code: https://github.com/wzzheng/StreamVGGT
10
- - Paper: https://arxiv.org/abs/2507.11539
11
- - Docs: https://wzzheng.net/StreamVGGT/
8
+ <div align="center">
9
+ <h1>Streaming 4D Visual Geometry Transformer</h1>
10
+ </div>
11
+
12
+ ### [Paper](https://arxiv.org/abs/2507.11539) | [Project Page](https://wzzheng.net/StreamVGGT)
13
+
14
+ >Streaming 4D Visual Geometry Transformer
15
+
16
+ >Dong Zhuo<sup>\*</sup>, [Wenzhao Zheng](https://wzzheng.net/)<sup>*</sup>$\dagger$, Jiahe Guo, Yuqi Wu, [Jie Zhou](https://scholar.google.com/citations?user=6a79aPwAAAAJ&hl=en&authuser=1), [Jiwen Lu](http://ivg.au.tsinghua.edu.cn/Jiwen_Lu/)
17
+
18
+ <sup>*</sup> Equal contribution. $\dagger$ Project leader.
19
+
20
+
21
+ **StreamVGGT**, a causal transformer architecture for **real-time streaming 4D visual geometry perception** compatible with attention mechanisms designed for LLMs (e.g., [FlashAttention](https://github.com/Dao-AILab/flash-attention)), delivers both fast inference and high-quality 4D reconstruction.
22
+
23
+
24
+ ## Overview
25
+
26
+ Unlike offline models, which must reprocess the entire sequence and reconstruct the entire scene each time a new image arrives, StreamVGGT processes an image sequence with temporal
27
+ causal attention and cached memory tokens, supporting efficient incremental on-the-fly reconstruction and enabling interactive, real-time online applications.
28
+
29
+ ## Quick start
30
+
31
+ Please refer to our [GitHub repository](https://github.com/wzzheng/StreamVGGT).
32
+
33
+ ## Citation
34
+
35
+ If you find this project helpful, please consider citing the following paper:
36
+ ```bibtex
37
+ @article{streamVGGT,
38
+ title={Streaming 4D Visual Geometry Transformer},
39
+ author={Dong Zhuo and Wenzhao Zheng and Jiahe Guo and Yuqi Wu and Jie Zhou and Jiwen Lu},
40
+ journal={arXiv preprint arXiv:2507.11539},
41
+ year={2025}
42
+ }
43
+ ```
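
The Overview in the updated README describes temporal causal attention with cached memory tokens. The following is a minimal sketch of that general idea, not the StreamVGGT implementation. It assumes one token per frame and identity projections (query = key = value), and the names `StreamingCausalAttention`, `step`, and `offline_causal_attention` are hypothetical, chosen only for illustration. The point it shows is that attending over a cache of past keys and values gives the same result as recomputing causally masked attention over the whole sequence, while each new frame touches only the cache.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Scaled dot-product attention of one query over lists of keys and values."""
    scale = 1.0 / math.sqrt(len(query))
    scores = [scale * sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]


class StreamingCausalAttention:
    """Hypothetical toy layer: keeps past keys/values in a cache so each new
    frame token attends only to itself and to earlier frames."""

    def __init__(self):
        self.cached_keys = []
        self.cached_values = []

    def step(self, token):
        # Assumption: identity projections (q = k = v = token) keep the sketch short.
        self.cached_keys.append(token)
        self.cached_values.append(token)
        return attend(token, self.cached_keys, self.cached_values)


def offline_causal_attention(tokens):
    """Reference: recompute causally masked attention over the full sequence."""
    return [attend(tokens[t], tokens[: t + 1], tokens[: t + 1])
            for t in range(len(tokens))]


# Three toy "frames", each reduced to a single 2-D token.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

layer = StreamingCausalAttention()
streamed = [layer.step(frame) for frame in frames]   # one frame at a time
reference = offline_causal_attention(frames)         # whole sequence at once
```

Here `streamed` and `reference` agree element by element. The streaming path handles frame `t` with a single query against `t + 1` cached entries, whereas an offline model that reruns on every arrival would repeat the work for all earlier frames.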