Add pipeline tag: video-text-to-text
#1 opened by nielsr (HF Staff)
README.md CHANGED
@@ -1,20 +1,21 @@
 ---
+datasets:
+- CSL-Daily
+- CSL-News
+language:
+- zh
 library_name: transformers
 license: mit
 model_name: Geo-Sign (Hyperbolic-Token)
-paperswithcode_id: geo-sign-hyperbolic-contrastive-regularisation
 tags:
-
-
-
-
-
-- CSL-Daily
-- CSL-News
-language:
-- zh
+- sign-language-translation
+- skeleton-based
+- hyperbolic-geometry
+- mT5
+paperswithcode_id: geo-sign-hyperbolic-contrastive-regularisation
 task:
-
+- sign-language-translation
+pipeline_tag: video-text-to-text
 ---

 # Geo-Sign
@@ -43,8 +44,8 @@ Geo-Sign projects pose-based sign-language features into a learnable **Poincaré
 Compared with the strong Uni-Sign pose baseline, Geo-Sign boosts BLEU-4 by **+1.81** and ROUGE-L by **+3.03** on the CSL-Daily benchmark while keeping privacy-friendly skeletal inputs only.

 ## Intended Uses & Scope
-*
-*
+* **Primary**: Sign-language-to-text translation research, especially for resource-constrained or privacy-sensitive settings where RGB video is unavailable.
+* **Out-of-scope**: Real-time production deployments without reliable pose estimation, medical or legal interpretations, or sign languages beyond those in the training datasets.

 ## Evaluation

@@ -58,11 +59,11 @@ Geo-Sign outperforms all previous gloss-free pose-only methods and rivals many R

 ## Limitations & Ethical Considerations

-*
-*
-*
-*
-*
+* **Pose-estimation dependency**: Errors in upstream key-points propagate to the translation.
+* **Training latency**: Hyperbolic operations slow training (~4–6×) but add **no** cost at inference.
+* **Generalisation**: Evaluated only on Chinese Sign Language; performance on other sign languages is not guaranteed.
+* **Mis-translation risk**: Automatic SLT can mis-communicate; keep a human in the loop for critical use cases.
+* **Biases**: CSL-Daily is domain-specific (news/TV); outputs may reflect that linguistic style.

 ---

@@ -74,5 +75,4 @@ Geo-Sign outperforms all previous gloss-free pose-only methods and rivals many R
 author={Fish, Edward and Bowden, Richard},
 journal={arXiv preprint arXiv:2506.00129},
 year={2025}
-}```
-
+}```
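For reviewers who want to sanity-check the metadata change locally, the following is a minimal sketch of reading the card's front matter and confirming the new `pipeline_tag`. The helper `read_front_matter` and the `CARD` string are hypothetical and written for illustration only; the parser handles just the flat `key: value` pairs and `- item` lists that appear in this diff, and it is not a substitute for a real YAML parser or for the Hub's own validation.

```python
def read_front_matter(text: str) -> dict:
    """Parse a README's leading `---` block into a dict.

    Hypothetical helper: supports only top-level `key: value` pairs and
    simple `- item` lists, which is all this model card uses.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no front matter present
    meta = {}
    current_key = None  # key whose list items we are currently collecting
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":
            break  # end of the front-matter block
        if not stripped:
            continue
        if stripped.startswith("- ") and current_key is not None:
            meta[current_key].append(stripped[2:].strip())
        elif ":" in stripped:
            key, _, value = stripped.partition(":")
            key, value = key.strip(), value.strip()
            if value:
                meta[key] = value
                current_key = None
            else:
                meta[key] = []  # a list follows on the next lines
                current_key = key
    return meta


# Front matter as it appears on the "+" side of this diff.
CARD = """---
datasets:
- CSL-Daily
- CSL-News
language:
- zh
library_name: transformers
license: mit
model_name: Geo-Sign (Hyperbolic-Token)
tags:
- sign-language-translation
- skeleton-based
- hyperbolic-geometry
- mT5
paperswithcode_id: geo-sign-hyperbolic-contrastive-regularisation
task:
- sign-language-translation
pipeline_tag: video-text-to-text
---

# Geo-Sign
"""

meta = read_front_matter(CARD)
print(meta["pipeline_tag"])
```

The same check could be pointed at the repository's actual README.md by reading the file and passing its text to the helper.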