Upload folder using huggingface_hub
README.md CHANGED
@@ -25,7 +25,7 @@ This repository contains the **MobileCLIP2-S0** checkpoint.
 
 | Model | # Seen <BR>Samples (B) | # Params (M) <BR> (img + txt) | Latency (ms) <BR> (img + txt) | IN-1k Zero-Shot <BR> Top-1 Acc. (%) | Avg. Perf. (%) <BR> on 38 datasets |
 |:----------------------------------------------------------|:----------------------:|:-----------------------------:|:-----------------------------:|:-----------------------------------:|:----------------------------------:|
-| [MobileCLIP2-S0](https://hf.co/apple/MobileCLIP2-S0) | 13 | 11.4 +
+| [MobileCLIP2-S0](https://hf.co/apple/MobileCLIP2-S0) | 13 | 11.4 + 63.4 | 1.5 + 3.3 | 71.5 | 59.7 |
 | [MobileCLIP2-S2](https://hf.co/apple/MobileCLIP2-S2) | 13 | 35.7 + 63.4 | 3.6 + 3.3 | 77.2 | 64.1 |
 | [MobileCLIP2-B](https://hf.co/apple/MobileCLIP2-B) | 13 | 86.3 + 63.4 | 10.4 + 3.3 | 79.4 | 65.8 |
 | [MobileCLIP2-S3](https://hf.co/apple/MobileCLIP2-S3) | 13 | 125.1 + 123.6 | 8.0 + 6.6 | 80.7 | 66.8 |
@@ -62,8 +62,11 @@ from mobileclip.modules.common.mobileone import reparameterize_model
 model, _, preprocess = open_clip.create_model_and_transforms('MobileCLIP2-S0', pretrained='/path/to/mobileclip2_s0.pt')
 tokenizer = open_clip.get_tokenizer('MobileCLIP2-S0')
 
+# Model needs to be in eval mode for inference because of batchnorm layers unlike ViTs
+model.eval()
+
 # For inference/model exporting purposes, please reparameterize first
-model = reparameterize_model(model
+model = reparameterize_model(model)
 
 image = preprocess(Image.open("docs/fig_accuracy_latency.png").convert('RGB')).unsqueeze(0)
 text = tokenizer(["a diagram", "a dog", "a cat"])
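The diffed snippet points `pretrained=` at a local path (`/path/to/mobileclip2_s0.pt`). Since this commit uploads the folder with `huggingface_hub`, a minimal sketch of fetching the weights from this repo is shown below; the filename `mobileclip2_s0.pt` is an assumption about how the checkpoint is named in the uploaded folder, not something stated in the diff.

```python
# Sketch: download the checkpoint from the Hub before creating the model.
# Assumes the uploaded folder contains a file named "mobileclip2_s0.pt".
from huggingface_hub import hf_hub_download

checkpoint_path = hf_hub_download(
    repo_id="apple/MobileCLIP2-S0",   # this model repo
    filename="mobileclip2_s0.pt",     # assumed checkpoint filename
)

# Pass the downloaded path instead of the placeholder path:
# model, _, preprocess = open_clip.create_model_and_transforms(
#     'MobileCLIP2-S0', pretrained=checkpoint_path)
```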
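The hunk ends after preprocessing the image and tokenizing the prompts. For context, a hedged sketch of how such an `open_clip` pipeline typically completes zero-shot classification with the reparameterized model is shown below; it uses standard `open_clip`/PyTorch calls (`encode_image`, `encode_text`) and is not copied from this README.

```python
# Sketch: zero-shot classification with the eval-mode, reparameterized model,
# using standard open_clip / PyTorch calls (not part of this commit).
import torch

with torch.no_grad():
    image_features = model.encode_image(image)  # shape (1, D)
    text_features = model.encode_text(text)     # shape (3, D)

    # L2-normalize so the dot product is a cosine similarity
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)

    # Softmax over scaled similarities gives per-prompt probabilities
    text_probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)

print("Label probs:", text_probs)  # "a diagram" should score highest for the example image
```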