Deploying the model with a vector database
Hello,
I have two rather pragmatic questions:
- Is there a leaderboard for text/image retrieval in German, and how does this model perform in German in your opinion?
-> My tests were quite promising here, but I don't have a good comparison/metric.
- How do I use this model with a vector database (like Milvus)?
-> Can I just get the vectors as in the code sample:
with torch.no_grad(), torch.autocast("cuda"):
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)
and perform a cosine similarity on them to get the distance in Milvus?
I have tried this and the results are not so good. I think that either the German language or Milvus is the problem.
Thanks in advance for the answer, and keep up the great work.
Kraebz
Hi Kraebz,
there are benchmarks at https://github.com/LAION-AI/CLIP_benchmark that support German, where you can evaluate the model; please have a look there.
This model itself is only trained on English, though.
You can try NLLB CLIP (https://arxiv.org/abs/2309.01859) which is multilingual.
See also https://github.com/mlfoundations/open_clip/blob/main/docs/openclip_multilingual_retrieval_results.csv for results
of different models on multilingual benchmarks.
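On the vector database part of the question: once the features are L2-normalized as in the snippet above, the inner product of two vectors equals their cosine similarity, so a store configured with an inner-product or cosine metric should rank results the same way. The following is only a dependency-free sketch of that idea, not Milvus code. The three-dimensional vectors are toy stand-ins for real embeddings, and the helper names (l2_normalize, top_k) are made up for illustration.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length (same idea as features / features.norm())."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]

def inner_product(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    """Cosine similarity computed directly from unnormalized vectors."""
    return inner_product(a, b) / math.sqrt(inner_product(a, a) * inner_product(b, b))

def top_k(query, index, k=3):
    """Brute-force search: rank stored unit vectors by inner product with the query."""
    q = l2_normalize(query)
    scored = [(inner_product(q, vec), name) for name, vec in index.items()]
    scored.sort(reverse=True)  # higher similarity first
    return [(name, score) for score, name in scored[:k]]

# Toy stand-ins for image embeddings (hypothetical values, not model output).
raw_images = {
    "cat.jpg": [0.9, 0.1, 0.0],
    "dog.jpg": [0.1, 0.9, 0.1],
    "car.jpg": [0.0, 0.1, 0.9],
}

# Normalize once at insert time, as a vector database would store them.
index = {name: l2_normalize(vec) for name, vec in raw_images.items()}

# Toy stand-in for a text embedding.
text_query = [0.8, 0.3, 0.0]
results = top_k(text_query, index, k=2)

for name, score in results:
    print(name, round(score, 4))
```

If the rankings from the vector database differ from a brute-force check like this on the same vectors, the metric type or the stored vectors (for example, unnormalized ones) would be the first things to look at. If they match, the database is probably not the problem, and the English-only training of the model is the more likely cause.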
Best,
Mehdi