saifkhichi96 committed on
Commit 3c90485 · verified · 1 Parent(s): f87a6a4

Update README.md

Files changed (1)
  1. README.md +264 -1
README.md CHANGED
@@ -2,4 +2,267 @@
  license: cc-by-nc-4.0
  datasets:
  - saifkhichi96/spinetrack
- ---
+ base_model:
+ - Tau-J/RTMPose
+ tags:
+ - 2d-human-pose-estimation
+ - computer-vision
+ - keypoint-detection
+ - spinepose
+ - spinetrack
+ language:
+ - en
+ ---
+
+ # 🩻 Model Card for **SpinePose** Family
+
+ **SpinePose** is a family of 2D human pose estimation models trained to estimate a **37-keypoint skeleton**, extending standard body skeletons with detailed keypoints for the **spine**, **pelvis**, and **feet**.
+ It offers high anatomical precision for **biomechanical analysis**, **ergonomic assessment**, and **clinical pose tracking**, while maintaining compatibility with COCO-style keypoint definitions.
+
+ ---
+
+ ## 📘 Model Details
+
+ ### Description
+
+ - **Developed by:** [Muhammad Saif Ullah Khan](https://saifkhichi.com/)
+ - **Affiliation:** Technical University of Kaiserslautern & [DFKI](https://av.dfki.de/)
+ - **Funding:** DFKI GmbH
+ - **Model Type:** Top-down 2D keypoint estimator
+ - **License:** [CC-BY-NC-4.0](https://creativecommons.org/licenses/by-nc/4.0/)
+ - **Frameworks:** PyTorch, ONNX Runtime
+ - **Input Resolution:** 256×192 or 384×288 (depending on variant)
+
+ ### Sources
+
+ - **Repository:** [github.com/dfki-av/spinepose](https://github.com/dfki-av/spinepose)
+ - **Paper:** [CVPR Workshops 2025 (CVSPORTS)](https://openaccess.thecvf.com/content/CVPR2025W/CVSPORTS/html/Khan_Towards_Unconstrained_2D_Pose_Estimation_of_the_Human_Spine_CVPRW_2025_paper.html)
+ - **Demo:** [saifkhichi.com/research/spinepose](https://www.saifkhichi.com/research/spinepose/)
+
+ ---
+
+ ## Intended Uses
+
+ ### Direct Use
+ - Human body and spine joint localization from RGB images or videos
+ - Real-time motion analysis for research, animation, or sports applications
+ - Augmentation of general-purpose pose estimators for anatomically rich tasks
+
+ ### Downstream Use
+ - Integration with clinical posture tracking systems
+ - 3D pose lifting or musculoskeletal modeling (via SpineTrack synthetic subset)
+ - Fine-tuning on domain-specific datasets (industrial, rehabilitation, yoga)
+
+ ### Out-of-Scope Use
+ - Any medical diagnosis or treatment application without human oversight
+ - Full-body 3D reconstruction (requires a separate lifting model)
+ - Unverified use in safety-critical systems
+
+ ---
+
+ ## Bias, Risks, and Limitations
+
+ - The model was trained primarily on controlled and synthetic datasets and may underperform on occluded or extreme poses.
+ - Body types and cultural attire are represented with limited diversity.
+ - Bias is inherited from the COCO/Body8 datasets used to pretrain the teachers.
+
+ ### Recommendations
+ Evaluate the model on your specific domain and retrain or augment with domain-specific samples to mitigate dataset bias.
+
+ ---
+
+ ## Getting Started
+
+ ### Installation
+
+ ```bash
+ pip install spinepose
+ ```
+
+ On Linux/Windows with CUDA available, install the GPU version:
+
+ ```bash
+ pip install "spinepose[gpu]"
+ ```
+
+ ### CLI Usage
+
+ ```bash
+ spinepose -i /path/to/image_or_video -o /path/to/output
+ ```
+
+ This automatically downloads the correct ONNX checkpoint.
+ Run `spinepose -h` for detailed usage options.
+
+ ### Python API
+
+ ```python
+ import cv2
+ from spinepose import SpinePoseEstimator
+
+ # Initialize estimator (downloads ONNX model if not found locally)
+ estimator = SpinePoseEstimator(device='cuda')
+
+ # Perform inference on a single image
+ image = cv2.imread('path/to/image.jpg')
+ keypoints, scores = estimator.predict(image)
+ visualized = estimator.visualize(image, keypoints, scores)
+ cv2.imwrite('output.jpg', visualized)
+ ```
+
+ For higher-level use:
+
+ ```python
+ from spinepose.inference import infer_image, infer_video
+
+ # Single image inference
+ infer_image('path/to/image.jpg', 'output.jpg')
+
+ # Video inference with optional temporal smoothing
+ infer_video('path/to/video.mp4', 'output_video.mp4', use_smoothing=True)
+ ```
+
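The `keypoints, scores` pair returned by `estimator.predict` often needs light post-processing before downstream use. The sketch below keeps only the joints of one person whose confidence clears a threshold. It assumes, without confirmation from the SpinePose documentation, that one person's keypoints can be iterated as `(x, y)` pairs aligned with a same-length sequence of scores. The function name `confident_joints` and the 0.3 default threshold are illustrative and not part of the library.

```python
def confident_joints(keypoints, scores, threshold=0.3):
    """Return {joint_index: (x, y)} for one person's joints with score >= threshold.

    Assumed inputs (hypothetical, not confirmed by the library docs):
      keypoints: sequence of (x, y) pairs, one per joint
      scores:    sequence of per-joint confidences, same length
    """
    if len(keypoints) != len(scores):
        raise ValueError("keypoints and scores must have the same length")
    return {
        index: (float(x), float(y))
        for index, ((x, y), score) in enumerate(zip(keypoints, scores))
        if score >= threshold
    }


# Synthetic stand-in for model output: three joints, the second one unreliable.
demo_joints = confident_joints(
    [(10, 20), (30, 40), (50, 60)],
    [0.9, 0.1, 0.5],
)
```

Dropping unreliable joints instead of zeroing them keeps later geometry (for example spine curvature angles) from silently using bad coordinates.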
+ ## Evaluation
+
+ To reproduce the results, prepare the following directory layout:
+
+ ```plaintext
+ <PROJECT_DIR>/
+ ├─ data/
+ │  ├─ spinetrack/
+ │  ├─ coco/
+ │  └─ halpe/
+ └─ checkpoints/
+    ├─ spinepose-s_32xb256-10e_spinetrack-256x192.pth
+    ├─ spinepose-m_32xb256-10e_spinetrack-256x192.pth
+    ├─ spinepose-l_32xb256-10e_spinetrack-256x192.pth
+    └─ spinepose-x_32xb128-10e_spinetrack-384x288.pth
+ ```
+
+ Each PyTorch checkpoint contains both `teacher` and `student` weights, with only the `student` used during inference. Exported ONNX checkpoints contain only the `student`.
+
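Before launching an evaluation run, it can help to confirm that the layout above is in place. The following is a minimal sketch using only `pathlib`; the directory and checkpoint names are copied from the layout above, while the helper name `missing_paths` is hypothetical and not part of the SpinePose repository.

```python
from pathlib import Path

# Names taken from the directory layout described above.
EXPECTED_DATA_DIRS = ("spinetrack", "coco", "halpe")
EXPECTED_CHECKPOINTS = (
    "spinepose-s_32xb256-10e_spinetrack-256x192.pth",
    "spinepose-m_32xb256-10e_spinetrack-256x192.pth",
    "spinepose-l_32xb256-10e_spinetrack-256x192.pth",
    "spinepose-x_32xb128-10e_spinetrack-384x288.pth",
)


def missing_paths(project_dir):
    """Return the expected entries (relative to project_dir) that do not exist."""
    root = Path(project_dir)
    expected = [Path("data") / name for name in EXPECTED_DATA_DIRS]
    expected += [Path("checkpoints") / name for name in EXPECTED_CHECKPOINTS]
    return [relative for relative in expected if not (root / relative).exists()]
```

Calling `missing_paths("<PROJECT_DIR>")` with your own project path returns an empty list when everything is in place.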
+ ### Metrics
+
+ We report **Average Precision (AP)** and **Average Recall (AR)** under varying Object Keypoint Similarity (OKS) thresholds, consistent with COCO conventions but extended to the 37-keypoint SpineTrack format.
+
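For readers unfamiliar with OKS, here is a minimal sketch of the standard COCO definition: each labeled keypoint contributes `exp(-d² / (2 · s² · k²))`, where `d` is the pixel distance between prediction and ground truth, `s²` is the object area, and `k = 2σ` is a per-keypoint constant. The per-keypoint sigmas used for the 37-keypoint format are not stated in this card, so the uniform value in the demo is a placeholder assumption, and the function name is hypothetical.

```python
import math


def object_keypoint_similarity(pred, gt, visibility, area, sigmas):
    """COCO-style OKS between one predicted pose and one ground-truth pose.

    pred, gt:    sequences of (x, y) pairs, one per keypoint
    visibility:  COCO visibility flags; only keypoints with v > 0 are scored
    area:        ground-truth object area in pixels squared (the scale term s^2)
    sigmas:      per-keypoint falloff constants (k_i = 2 * sigma_i)
    """
    total, labeled = 0.0, 0
    for (px, py), (gx, gy), v, sigma in zip(pred, gt, visibility, sigmas):
        if v <= 0:
            continue  # unlabeled keypoints do not contribute
        d2 = (px - gx) ** 2 + (py - gy) ** 2
        k2 = (2.0 * sigma) ** 2
        total += math.exp(-d2 / (2.0 * area * k2))
        labeled += 1
    return total / labeled if labeled else 0.0


# Demo with placeholder sigmas (NOT the values used for SpineTrack).
demo_gt = [(100.0, 100.0), (120.0, 150.0), (80.0, 150.0)]
demo_pred = [(103.0, 104.0), (120.0, 150.0), (80.0, 150.0)]
demo_sigmas = [0.05, 0.05, 0.05]
demo_oks = object_keypoint_similarity(demo_pred, demo_gt, [2, 2, 2], 5000.0, demo_sigmas)
```

Under COCO conventions, a prediction counts as a match at threshold `t` when its OKS is at least `t`, and AP/AR are averaged over thresholds from 0.50 to 0.95 in steps of 0.05.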
+ ### Results
+
+ <table border="1" cellspacing="0" cellpadding="6" style="border-collapse:collapse; text-align:center; font-family:Arial; font-size:13px;">
+ <thead style="background-color:#f0f0f0; font-weight:bold;">
+ <tr>
+ <th>Method</th>
+ <th>Train Data</th>
+ <th>Kpts</th>
+ <th colspan="2">COCO</th>
+ <th colspan="2">Halpe26</th>
+ <th colspan="2">Body</th>
+ <th colspan="2">Feet</th>
+ <th colspan="2">Spine</th>
+ <th colspan="2">Overall</th>
+ <th>Params (M)</th>
+ <th>FLOPs (G)</th>
+ </tr>
+ <tr>
+ <th></th><th></th><th></th>
+ <th>AP</th><th>AR</th>
+ <th>AP</th><th>AR</th>
+ <th>AP</th><th>AR</th>
+ <th>AP</th><th>AR</th>
+ <th>AP</th><th>AR</th>
+ <th>AP</th><th>AR</th>
+ <th></th><th></th>
+ </tr>
+ </thead>
+ <tbody>
+ <tr><td>SimCC-MBV2</td><td>COCO</td><td>17</td><td>62.0</td><td>67.8</td><td>33.2</td><td>43.9</td><td>72.1</td><td>75.6</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.1</td><td>0.1</td><td>2.29</td><td>0.31</td></tr>
+ <tr><td>RTMPose-t</td><td>Body8</td><td>26</td><td>65.9</td><td>71.3</td><td>68.0</td><td>73.2</td><td>76.9</td><td>80.0</td><td>74.1</td><td>79.7</td><td>0.0</td><td>0.0</td><td>15.8</td><td>17.9</td><td>3.51</td><td>0.37</td></tr>
+ <tr><td>RTMPose-s</td><td>Body8</td><td>26</td><td>69.7</td><td>74.7</td><td>72.0</td><td>76.7</td><td>80.9</td><td>83.6</td><td>78.9</td><td>83.5</td><td>0.0</td><td>0.0</td><td>17.2</td><td>19.4</td><td>5.70</td><td>0.70</td></tr>
+ <tr style="background-color:#e6e6e6; font-weight:bold;"><td>SpinePose-s</td><td>SpineTrack</td><td>37</td><td>68.2</td><td>73.1</td><td>70.6</td><td>75.2</td><td>79.1</td><td>82.1</td><td>77.5</td><td>82.9</td><td>89.6</td><td>90.7</td><td>84.2</td><td>86.2</td><td>5.98</td><td>0.72</td></tr>
+ <tr><td colspan="17" style="background-color:#d0d0d0; height:3px;"></td></tr>
+ <tr><td>SimCC-ViPNAS</td><td>COCO</td><td>17</td><td>69.5</td><td>75.5</td><td>36.9</td><td>49.7</td><td>79.6</td><td>83.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.2</td><td>0.2</td><td>8.65</td><td>0.80</td></tr>
+ <tr><td>RTMPose-m</td><td>Body8</td><td>26</td><td>75.1</td><td>80.0</td><td>76.7</td><td>81.3</td><td>85.5</td><td>87.9</td><td>84.1</td><td>88.2</td><td>0.0</td><td>0.0</td><td>19.4</td><td>21.4</td><td>13.93</td><td>1.95</td></tr>
+ <tr style="background-color:#e6e6e6; font-weight:bold;"><td>SpinePose-m</td><td>SpineTrack</td><td>37</td><td>73.0</td><td>77.5</td><td>75.0</td><td>79.2</td><td>84.0</td><td>86.4</td><td>83.5</td><td>87.4</td><td>91.4</td><td>92.5</td><td>88.0</td><td>89.5</td><td>14.34</td><td>1.98</td></tr>
+ <tr><td colspan="17" style="background-color:#d0d0d0; height:3px;"></td></tr>
+ <tr><td>RTMPose-l</td><td>Body8</td><td>26</td><td>76.9</td><td>81.5</td><td>78.4</td><td>82.9</td><td>86.8</td><td>89.2</td><td>86.9</td><td>90.0</td><td>0.0</td><td>0.0</td><td>20.0</td><td>22.0</td><td>28.11</td><td>4.19</td></tr>
+ <tr><td>RTMW-m</td><td>Cocktail14</td><td>133</td><td>73.8</td><td>78.7</td><td>63.8</td><td>68.5</td><td>84.3</td><td>86.7</td><td>83.0</td><td>87.2</td><td>0.0</td><td>0.0</td><td>6.2</td><td>7.6</td><td>32.26</td><td>4.31</td></tr>
+ <tr><td>SimCC-ResNet50</td><td>COCO</td><td>17</td><td>72.1</td><td>78.2</td><td>38.7</td><td>51.6</td><td>81.8</td><td>85.2</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.2</td><td>0.2</td><td>36.75</td><td>5.50</td></tr>
+ <tr style="background-color:#e6e6e6; font-weight:bold;"><td>SpinePose-l</td><td>SpineTrack</td><td>37</td><td>75.2</td><td>79.5</td><td>77.0</td><td>81.1</td><td>85.4</td><td>87.7</td><td>85.5</td><td>89.2</td><td>91.0</td><td>92.2</td><td>88.4</td><td>90.0</td><td>28.66</td><td>4.22</td></tr>
+ <tr><td colspan="17" style="background-color:#d0d0d0; height:3px;"></td></tr>
+ <tr><td>SimCC-ResNet50*</td><td>COCO</td><td>17</td><td>73.4</td><td>79.0</td><td>39.8</td><td>52.4</td><td>83.2</td><td>86.2</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.3</td><td>0.3</td><td>43.29</td><td>12.42</td></tr>
+ <tr><td>RTMPose-x*</td><td>Body8</td><td>26</td><td>78.8</td><td>83.4</td><td>80.0</td><td>84.4</td><td>88.6</td><td>90.6</td><td>88.4</td><td>91.4</td><td>0.0</td><td>0.0</td><td>21.0</td><td>22.9</td><td>50.00</td><td>17.29</td></tr>
+ <tr><td>RTMW-l*</td><td>Cocktail14</td><td>133</td><td>75.6</td><td>80.4</td><td>65.4</td><td>70.1</td><td>86.0</td><td>88.3</td><td>85.6</td><td>89.2</td><td>0.0</td><td>0.0</td><td>8.1</td><td>8.1</td><td>57.20</td><td>7.91</td></tr>
+ <tr><td>RTMW-l*</td><td>Cocktail14</td><td>133</td><td>77.2</td><td>82.3</td><td>66.6</td><td>71.8</td><td>87.3</td><td>89.9</td><td>88.3</td><td>91.3</td><td>0.0</td><td>0.0</td><td>8.6</td><td>8.6</td><td>57.35</td><td>17.69</td></tr>
+ <tr style="background-color:#e6e6e6; font-weight:bold;"><td>SpinePose-x*</td><td>SpineTrack</td><td>37</td><td>75.9</td><td>80.1</td><td>77.6</td><td>81.8</td><td>86.3</td><td>88.5</td><td>86.3</td><td>89.7</td><td>89.3</td><td>91.0</td><td>88.9</td><td>89.9</td><td>50.69</td><td>17.37</td></tr>
+ </tbody>
+ </table>
+
+ ## SpineTrack Dataset
+
+ The **SpineTrack** dataset comprises both real and synthetic data:
+
+ - **SpineTrack-Real**: Annotated natural images with nine detailed spinal landmarks in addition to COCO joints.
+ - **SpineTrack-Unreal**: Synthetic subset rendered in Unreal Engine with biomechanically aligned OpenSim annotations.
+
+ To download:
+
+ ```bash
+ git lfs install
+ git clone https://huggingface.co/datasets/saifkhichi96/spinetrack
+ ```
+
+ Alternatively, use `wget` to download the dataset directly:
+
+ ```bash
+ wget https://huggingface.co/datasets/saifkhichi96/spinetrack/resolve/main/annotations.zip
+ wget https://huggingface.co/datasets/saifkhichi96/spinetrack/resolve/main/images.zip
+ ```
+
+ Either method yields two zip archives, `annotations` (24.8 MB) and `images` (19.4 GB), which can be unzipped to obtain the following structure:
+
+ ```plaintext
+ spinetrack
+ ├── annotations/
+ │   ├── person_keypoints_train-real-coco.json
+ │   ├── person_keypoints_train-real-yoga.json
+ │   ├── person_keypoints_train-unreal.json
+ │   └── person_keypoints_val2017.json
+ └── images/
+     ├── train-real-coco/
+     ├── train-real-yoga/
+     ├── train-unreal/
+     └── val2017/
+ ```
+
+ All annotations follow the COCO format and are directly compatible with MMPose, Detectron2, or similar frameworks.
+
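Because the annotations are COCO-format JSON, they can be inspected with the standard library alone. The sketch below counts how often each keypoint index is labeled. It relies on the standard COCO layout (top-level `images`, `annotations`, and `categories`, with each annotation's `keypoints` stored as a flat `[x, y, v, ...]` list) and assumes that the SpineTrack files list keypoint names under `categories[0]["keypoints"]`. The function names are hypothetical.

```python
import json
from collections import Counter


def load_annotations(path):
    """Read a COCO-format annotation file into a dict."""
    with open(path, "r", encoding="utf-8") as handle:
        return json.load(handle)


def summarize_keypoint_annotations(coco):
    """Summarize a COCO-format keypoint annotation dict."""
    names = coco["categories"][0].get("keypoints", [])
    labeled = Counter()
    for annotation in coco["annotations"]:
        flat = annotation["keypoints"]  # [x1, y1, v1, x2, y2, v2, ...]
        for index, visibility in enumerate(flat[2::3]):
            if visibility > 0:  # v = 0 means "not labeled" in COCO
                labeled[index] += 1
    return {
        "num_images": len(coco["images"]),
        "num_annotations": len(coco["annotations"]),
        "num_keypoints": len(names),
        "labeled_per_keypoint": dict(labeled),
    }


# Tiny in-memory example standing in for a real annotation file.
tiny = {
    "images": [{"id": 1}],
    "annotations": [{"image_id": 1, "keypoints": [10, 20, 2, 0, 0, 0, 30, 40, 1]}],
    "categories": [{"id": 1, "name": "person", "keypoints": ["a", "b", "c"]}],
}
tiny_summary = summarize_keypoint_annotations(tiny)
```

For a real file, pass the result of `load_annotations("spinetrack/annotations/person_keypoints_val2017.json")` to the summary function.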
+ The synthetic subset was primarily employed within the **active learning pipeline** used to bootstrap and refine annotations for real-world images.
+ All released **SpinePose** models were trained exclusively on the **real** portion of the dataset.
+
+ > [!WARNING]
+ > A small number of annotations in the synthetic subset are corrupted.
+ > We recommend avoiding their use until the updated labels are released in the next dataset version.
+
+ ## Citation
+
+ If you use SpinePose or SpineTrack in your research, please cite:
+
+ **BibTeX:**
+
+ ```bibtex
+ @InProceedings{Khan_2025_CVPR,
+ author = {Khan, Muhammad Saif Ullah and Krau{\ss}, Stephan and Stricker, Didier},
+ title = {Towards Unconstrained 2D Pose Estimation of the Human Spine},
+ booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
+ month = {June},
+ year = {2025},
+ pages = {6171-6180}
+ }
+ ```
+
+ **APA:**
+
+ _Khan, M. S. U., Krauß, S., & Stricker, D. (2025). Towards Unconstrained 2D Pose Estimation of the Human Spine. In Proceedings of the Computer Vision and Pattern Recognition Conference (pp. 6172-6181)._
+
+ ## Model Card Contact
+
+ [Muhammad Saif Ullah Khan]([email protected])