laion/CLIP-ViT-B-32-256x256-DataComp-s34B-b86K · Deployment of Model with vector Database

Hello,

I have two rather pragmatic questions:

Is there a leaderboard for text/image retrieval in German, and how well does this model perform in German in your opinion?
-> My tests were quite promising here, but I don't have a good comparison or metric.
How do I use this model with a vector database (like Milvus)?
-> Can I just get the vectors as in the code sample:

with torch.no_grad(), torch.autocast("cuda"):
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)

and perform a cosine similarity search on them in Milvus to get the distance?
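
To make the second question concrete, here is a minimal sketch of what I understand the search step to be doing. It is only an illustration under assumptions: it uses plain Python with random stand-in vectors instead of the real model outputs and instead of Milvus, the dimension of 512 is my assumption for a ViT-B/32 CLIP model, and the helper names (l2_normalize, dot, cosine) are my own, not part of any library.

```python
import math
import random


def l2_normalize(vec):
    """Scale a vector to unit length (same effect as x / x.norm())."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity computed on the raw (unnormalized) vectors."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


random.seed(0)
DIM = 512  # assumption: embedding size of a ViT-B/32 CLIP model

# Random stand-ins for the encode_image / encode_text outputs.
image_vecs = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(5)]
text_vec = [random.gauss(0, 1) for _ in range(DIM)]

# Same normalization step as in the snippet above.
image_unit = [l2_normalize(v) for v in image_vecs]
text_unit = l2_normalize(text_vec)

# On unit vectors, the inner product equals the cosine similarity.
ip_scores = [dot(text_unit, v) for v in image_unit]
cos_scores = [cosine(text_vec, v) for v in image_vecs]

# On unit vectors, squared L2 distance is 2 - 2 * cosine similarity.
l2_sq = [sum((a - b) ** 2 for a, b in zip(text_unit, v)) for v in image_unit]

# Best match first: highest inner product, or lowest L2 distance.
rank_by_ip = sorted(range(len(image_unit)), key=lambda i: -ip_scores[i])
rank_by_l2 = sorted(range(len(image_unit)), key=lambda i: l2_sq[i])
```

If this is right, then once the vectors are normalized, an inner-product metric, a cosine metric, and an L2 metric should all return the same ranking, so the choice of metric alone would not explain poor results.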

I have tried this, and the results are not very good. I suspect that either the German language or Milvus is the problem.

Thanks in advance for your answer, and keep up the great work!

Kraebz