likaixin committed · Commit fda7cb4 · verified · 1 Parent(s): c84e1d1

Update README.md

Files changed (1)
  1. README.md +37 -1
README.md CHANGED
@@ -36,4 +36,40 @@ model-index:
 
  A CLIP ViT-B/32 model trained with the [IconStack dataset](https://huggingface.co/datasets/likaixin/IconStack-Captions-48M) using [OpenCLIP](https://github.com/mlfoundations/open_clip).
 
- It scores 80.24% on zero-shot classification on [icon-dataset](https://huggingface.co/datasets/likaixin/ui-icon-dataset).
+ It scores 80.24% on zero-shot classification on [icon-dataset](https://huggingface.co/datasets/likaixin/ui-icon-dataset).
+
+
+ ## Installation
+ You need to install `open_clip` to use this model:
+ ```bash
+ pip install open_clip_torch
+ ```
+
+ ## Icon-to-Text Zero-Shot Classification
+
+ ```python
+ import torch
+ from PIL import Image
+ import open_clip
+
+ CLIP_TEXT_TEMPLATE = "an icon of {}"
+ ICON_CLASSES = ["add", "close", "play", ...] # Modify your class names here
+
+ model_checkpoint = "<path_to_your_local_model>"
+ model, _, preprocess = open_clip.create_model_and_transforms('ViT-B-32', pretrained=model_checkpoint)
+ model.eval()
+ tokenizer = open_clip.get_tokenizer('ViT-B-32')
+
+ image = preprocess(Image.open("icon.png")).unsqueeze(0)
+ text = tokenizer([CLIP_TEXT_TEMPLATE.format(cls) for cls in ICON_CLASSES])
+
+ with torch.no_grad(), torch.autocast("cuda"):
+     image_features = model.encode_image(image)
+     text_features = model.encode_text(text)
+     image_features /= image_features.norm(dim=-1, keepdim=True)
+     text_features /= text_features.norm(dim=-1, keepdim=True)
+
+     text_probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)
+
+ print("Label probs:", text_probs) # prints something like: [[1., 0., 0., ...]]
+ ```
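
The scoring step in the added README snippet (L2-normalize both feature sets, take scaled dot products, apply softmax) can be sketched without any model at all. The version below is a minimal illustration, not part of the commit: the helper names `l2_normalize`, `zero_shot_probs`, and `top_k_labels` are hypothetical, and the 3-dimensional feature vectors are made-up toy values rather than real CLIP embeddings.

```python
import math

CLIP_TEXT_TEMPLATE = "an icon of {}"


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def zero_shot_probs(image_feature, text_features, scale=100.0):
    """Softmax over scaled cosine similarities between one image and each prompt."""
    img = l2_normalize(image_feature)
    logits = [
        scale * sum(a * b for a, b in zip(img, l2_normalize(txt)))
        for txt in text_features
    ]
    peak = max(logits)  # subtract the max before exponentiating, for stability
    exps = [math.exp(logit - peak) for logit in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_labels(probs, class_names, k=3):
    """Pair each probability with its class name and keep the k most likely."""
    ranked = sorted(zip(class_names, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Toy features, invented for illustration only (hypothetical, not model output).
classes = ["add", "close", "play"]
toy_image = [0.9, 0.1, 0.0]
toy_texts = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

probs = zero_shot_probs(toy_image, toy_texts)
for name, p in top_k_labels(probs, classes, k=2):
    print(f"{CLIP_TEXT_TEMPLATE.format(name)}: {p:.4f}")
```

With the real model, something like `top_k_labels(text_probs[0].tolist(), ICON_CLASSES)` should turn the raw probability tensor into readable (label, probability) pairs, assuming `text_probs` has shape `[1, num_classes]` as in the snippet above.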