---
license: apache-2.0
datasets:
- cj-mills/hagrid-classification-512p-no-gesture-150k
language:
- en
base_model:
- google/siglip2-so400m-patch14-384
pipeline_tag: image-classification
library_name: transformers
tags:
- Gesture
- Classification
- SigLIP2
- 19:Styles
- Vision-Encoder
---

# **Hand-Gesture-19**

> **Hand-Gesture-19** is a vision-language encoder model fine-tuned from **google/siglip2-so400m-patch14-384** for a single-label image classification task. It is designed to classify hand gesture images into nineteen gesture categories using the **SiglipForImageClassification** architecture.

```py
Classification Report:
                 precision    recall  f1-score   support

           call     0.9889    0.9739    0.9813      6939
        dislike     0.9892    0.9863    0.9877      7028
           fist     0.9956    0.9923    0.9940      6882
           four     0.9632    0.9653    0.9643      7183
           like     0.9668    0.9855    0.9760      6823
           mute     0.9848    0.9976    0.9912      7139
     no_gesture     0.9960    0.9957    0.9958     27823
             ok     0.9872    0.9831    0.9852      6924
            one     0.9817    0.9854    0.9835      7062
           palm     0.9793    0.9848    0.9820      7050
          peace     0.9723    0.9635    0.9679      6965
 peace_inverted     0.9806    0.9836    0.9821      6876
           rock     0.9853    0.9865    0.9859      6883
           stop     0.9614    0.9901    0.9756      6893
  stop_inverted     0.9933    0.9712    0.9821      7142
          three     0.9712    0.9478    0.9594      6940
         three2     0.9785    0.9799    0.9792      6870
         two_up     0.9848    0.9863    0.9855      7346
two_up_inverted     0.9855    0.9871    0.9863      6967

       accuracy                         0.9833    153735
      macro avg     0.9813    0.9814    0.9813    153735
   weighted avg     0.9833    0.9833    0.9833    153735
```
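The table above follows the output format of scikit-learn's `classification_report`. A minimal sketch of how such an evaluation could be reproduced is shown below; the split name, the `image`/`label` column names, and the assumption that the dataset's integer labels match the model's class indices are guesses that may differ from the actual training setup.

```python
# Sketch: re-computing a classification report for this checkpoint.
# Assumptions: the HaGRID dataset exposes decoded PIL images under "image"
# and integer class ids under "label", ordered like the model's classes,
# and has a "test" split. Batching is omitted for brevity.
import torch
from datasets import load_dataset
from sklearn.metrics import classification_report
from transformers import AutoImageProcessor, SiglipForImageClassification

model_name = "prithivMLmods/Hand-Gesture-19"
model = SiglipForImageClassification.from_pretrained(model_name).eval()
processor = AutoImageProcessor.from_pretrained(model_name)

ds = load_dataset("cj-mills/hagrid-classification-512p-no-gesture-150k", split="test")

y_true, y_pred = [], []
for example in ds:
    inputs = processor(images=example["image"].convert("RGB"), return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    y_pred.append(int(logits.argmax(dim=-1)))
    y_true.append(int(example["label"]))

print(classification_report(y_true, y_pred, digits=4))
```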
The model categorizes images into nineteen hand gestures (a sketch for reading this mapping from the model config follows the list):

- **Class 0:** "call"
- **Class 1:** "dislike"
- **Class 2:** "fist"
- **Class 3:** "four"
- **Class 4:** "like"
- **Class 5:** "mute"
- **Class 6:** "no_gesture"
- **Class 7:** "ok"
- **Class 8:** "one"
- **Class 9:** "palm"
- **Class 10:** "peace"
- **Class 11:** "peace_inverted"
- **Class 12:** "rock"
- **Class 13:** "stop"
- **Class 14:** "stop_inverted"
- **Class 15:** "three"
- **Class 16:** "three2"
- **Class 17:** "two_up"
- **Class 18:** "two_up_inverted"
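The index-to-name mapping above can also be read from the checkpoint itself, assuming the fine-tuned config stores `id2label` (the Gradio example below hardcodes the same mapping instead, so this is not guaranteed):

```python
# Sketch: reading the label mapping from the model config. If id2label was
# not customized during fine-tuning, this may print generic LABEL_0..LABEL_18.
from transformers import SiglipForImageClassification

model = SiglipForImageClassification.from_pretrained("prithivMLmods/Hand-Gesture-19")
for idx in sorted(model.config.id2label):
    print(idx, model.config.id2label[idx])
```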
# **Run with Transformers🤗**

```python
!pip install -q transformers torch pillow gradio
```

```python
import gradio as gr
import torch
from PIL import Image
from transformers import AutoImageProcessor, SiglipForImageClassification

# Load model and processor
model_name = "prithivMLmods/Hand-Gesture-19"
model = SiglipForImageClassification.from_pretrained(model_name)
processor = AutoImageProcessor.from_pretrained(model_name)

def hand_gesture_classification(image):
    """Predicts the hand gesture category from an image."""
    image = Image.fromarray(image).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")

    with torch.no_grad():
        outputs = model(**inputs)
        logits = outputs.logits
        probs = torch.nn.functional.softmax(logits, dim=1).squeeze().tolist()

    labels = {
        "0": "call",
        "1": "dislike",
        "2": "fist",
        "3": "four",
        "4": "like",
        "5": "mute",
        "6": "no_gesture",
        "7": "ok",
        "8": "one",
        "9": "palm",
        "10": "peace",
        "11": "peace_inverted",
        "12": "rock",
        "13": "stop",
        "14": "stop_inverted",
        "15": "three",
        "16": "three2",
        "17": "two_up",
        "18": "two_up_inverted"
    }
    predictions = {labels[str(i)]: round(probs[i], 3) for i in range(len(probs))}

    return predictions

# Create Gradio interface
iface = gr.Interface(
    fn=hand_gesture_classification,
    inputs=gr.Image(type="numpy"),
    outputs=gr.Label(label="Prediction Scores"),
    title="Hand Gesture Classification",
    description="Upload an image to classify the hand gesture."
)

# Launch the app
if __name__ == "__main__":
    iface.launch()
```
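For a quick check without the Gradio UI, the same checkpoint can be applied to a single image file. This is a minimal sketch: `gesture.jpg` is a placeholder path, and the `id2label` lookup assumes the config stores the gesture names listed above.

```python
# Minimal single-image inference sketch (no Gradio). "gesture.jpg" is a
# placeholder path; swap in any hand-gesture image.
import torch
from PIL import Image
from transformers import AutoImageProcessor, SiglipForImageClassification

model_name = "prithivMLmods/Hand-Gesture-19"
model = SiglipForImageClassification.from_pretrained(model_name)
processor = AutoImageProcessor.from_pretrained(model_name)

image = Image.open("gesture.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

pred_id = int(logits.argmax(dim=-1))
print(model.config.id2label.get(pred_id, str(pred_id)))  # falls back to the raw index
```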
# **Intended Use:**

The **Hand-Gesture-19** model is designed to classify hand gesture images into one of the nineteen categories listed above. Potential use cases include:

- **Human-Computer Interaction:** Enabling gesture-based controls for devices (a small dispatch sketch follows this list).
- **Sign Language Interpretation:** Assisting in recognizing sign language gestures.
- **Gaming & VR:** Enhancing immersive experiences with hand gesture recognition.
- **Robotics:** Facilitating gesture-based robotic control.
- **Security & Surveillance:** Identifying gestures for access control and safety monitoring.
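
As a loose illustration of the human-computer interaction case, predicted labels can be mapped to application commands. Everything below is hypothetical: the action names, the confidence threshold, and the assumption that an upstream function supplies one (label, confidence) pair per frame.

```python
# Hypothetical sketch: turning gesture predictions into app commands.
# The label -> command table and the 0.8 threshold are illustrative only;
# a real controller would also debounce repeated detections across frames.

GESTURE_ACTIONS = {
    "mute": "toggle_mute",
    "stop": "pause_playback",
    "like": "volume_up",
    "dislike": "volume_down",
}

def dispatch(label: str, confidence: float, threshold: float = 0.8):
    """Return a command for a confident, known gesture, else None."""
    if confidence < threshold or label == "no_gesture":
        return None
    return GESTURE_ACTIONS.get(label)

# Example: dispatch("mute", 0.97) -> "toggle_mute"
```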